

WHAT IS CLAIMED IS:

1           1. A processing core comprising:  
2           R-number processing pipelines each comprising N-number of processing  
3           paths, wherein each of said R-number of processing pipelines are synchronized to operate as  
4           a single very long instruction word (VLIW) processing core, said VLIW processing core  
5           being configured to process $R \times N$-number of VLIW sub-instructions in parallel.

1           2. The processing core as recited in claim 1 wherein said R-number of  
2           processing pipelines can be configured to operate independently as separately operating  
3           pipelines.

1           3. The processing core as recited in claim 1 wherein each of said R-  
2           number of processing pipelines comprises S-number of register files, such that said  
3           processing core comprises $R \times S$-number of register files.

1           4. The processing core as recited in claim 3 wherein each of said R-  
2           number of processing pipelines comprises one register file for every two of said N-number of  
3           processing paths, such that $S = N/2$.

1           5. The processing core as recited in claim 3 wherein each of said register  
2           files comprises Q-number of M-bit wide registers, and wherein said Q-number of registers  
3           within each of said register files are either private or global registers, and wherein when a  
4           value is written to one of said Q-number of said registers which is a global register within one  
5           of said register files, said value is propagated to a corresponding global register in the other  
6           of said register files, and wherein when a value is written to one of said Q-number of said  
7           registers which is a private register within one of said register files, said value is not  
8           propagated to a corresponding register in the other of said register files.

1           6. The processing core as recited in claim 1, wherein a single VLIW  
2           processing instruction comprises $R \times N$-number of P-bit sub-instructions appended together.

1           7. The processing core as recited in claim 6, wherein $M=64$, $Q=64$, and  
2           $P=32$.

1           8. The processing core as recited in claim 3 wherein each of said R-  
2           number of processing pipelines comprises an execute stage which includes an execute unit for
3 each of said N-number processing paths, each of said execute units comprising an integer  
4 processing unit, a load/store processing unit, a floating point processing unit, or any  
5 combination of one or more of said integer processing units, said load/store processing units,  
6 and said floating point processing units.

1                   9.       The processing core as recited in claim 8 wherein an integer processing  
2 unit and a floating point processing unit share one of said register files.

1                   10.      The processing core as recited in claim 5 wherein Q=64, and a 64-bit  
2 special register stores bits indicating whether registers in the register files are private registers  
3 or global registers, each bit in the 64-bit special register corresponding to one of the registers  
4 in the register files.

1                   11.      The processing core as recited in claim 5 wherein a plurality of said  
2 register files are connected to a bus, and a value written to a global register in one of said  
3 register files connected to the bus is propagated to a corresponding global register in the other  
4 of said register files connected to said bus across said bus.

1                   12.      The processing core as recited in claim 5 wherein a plurality of said  
2 register files are connected together in serial, and a value written to a first global register in a  
3 first of said plurality of register files is propagated to a corresponding first global register in a  
4 second of said plurality of register files connected directly to said first of said plurality of  
5 register files.

1                   13.      In a computer system, a scalable computer processing architecture,  
2 comprising:

3                   one or more processor chips, each comprising:  
4                   a processing core, including:  
5                   R-number processing pipelines each comprising N-number of processing  
6 paths, wherein each of said R-number of processing pipelines are synchronized to operate as  
7 a single very long instruction word (VLIW) processing core, said VLIW processing core  
8 being configured to process $R \times N$-number of VLIW sub-instructions in parallel;  
9                   an I/O link configured to communicate with other of said one or more  
10 processor chips or with I/O devices;  
11                   a communication controller in electrical communication with said processing  
12 core and said I/O link;

13                   said communication controller for controlling the exchange of data between a  
14                   first one of said one or more processor chips and said other of said one or more processor  
15                   chips;

16                   wherein said computer processing architecture can be scaled larger by  
17                   connecting together two or more of said processor chips in parallel via said I/O links of said  
18                   processor chips, so as to create multiple processing core pipelines which share data  
19                   therebetween.

1                   14.    The computer system as recited in claim 13 wherein said R-number of  
2                   processing pipelines can be configured to operate independently as separately operating  
3                   pipelines.

1                   15.    The computer system as recited in claim 13 wherein each of said R-  
2                   number of processing pipelines comprises S-number of register files, such that said  
3                   processing core comprises $R \times S$-number of register files.

1                   16.    The computer system as recited in claim 15 wherein each of said R-  
2                   number of processing pipelines comprises one register file for every two of said N-number of  
3                   processing paths, such that $S = N/2$.

1                   17.    The computer system as recited in claim 15 wherein each of said  
2                   register files comprises Q-number of M-bit wide registers, and wherein said Q-number of  
3                   registers within each of said register files are either private or global registers, and wherein  
4                   when a value is written to one of said Q-number of said registers which is a global register  
5                   within one of said register files, said value is propagated to a corresponding global register in  
6                   the other of said register files, and wherein when a value is written to one of said Q-number  
7                   of said registers which is a private register within one of said register files, said value is not  
8                   propagated to a corresponding register in the other of said register files.

1                   18.    The computer system as recited in claim 13 wherein a single VLIW  
2                   processing instruction comprises $R \times N$-number of P-bit sub-instructions appended together.

1                   19.    The computer system as recited in claim 18 wherein $M=64$, $Q=64$, and  
2                   $P=32$.

1                   20. The computer system as recited in claim 15 wherein each of said  
2 R-number of processing pipelines comprises an execute stage which includes an execute unit  
3 for each of said N-number processing paths, each of said execute units comprising an integer  
4 processing unit, a load/store processing unit, a floating point processing unit, or any  
5 combination of one or more of said integer processing units, said load/store processing units,  
6 and said floating point processing units.

1                   21. The computer system as recited in claim 20 wherein an integer  
2 processing unit and a floating point processing unit share one of said register files.

1                   22. The computer system as recited in claim 17 wherein Q=64, and a 64-  
2 bit special register stores bits indicating whether registers in the register files are private  
3 registers or global registers, each bit in the 64-bit special register corresponding to one of the  
4 registers in the register files.

1                   23. The computer system as recited in claim 17 wherein a plurality of said  
2 register files are connected to a bus, and a value written to a global register in one of said  
3 register files connected to the bus is propagated to a corresponding global register in the other  
4 of said register files connected to said bus across said bus.

1                   24. The computer system as recited in claim 17 wherein a plurality of said  
2 register files are connected together in serial, and a value written to a first global register in a  
3 first of said plurality of register files is propagated to a corresponding first global register in a  
4 second of said plurality of register files connected directly to said first of said plurality of  
5 register files.
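
By way of illustration only, and not as part of any claim, the register behavior recited in claims 5 and 10 (and in claims 17 and 22) and the instruction width implied by claims 6 and 18 can be sketched in Python. This is a minimal behavioral model under stated assumptions, not the claimed hardware: the names Core, write, is_global and vliw_width_bits are hypothetical, the example values R=2 and N=4 are chosen arbitrarily, and the polarity of the special register (a set bit meaning "global") is assumed, since the claims do not specify it.

```python
Q = 64  # registers per register file (claims 10 and 22)
M = 64  # bits per register (claims 7 and 19)
P = 32  # bits per sub-instruction (claims 7 and 19)


def vliw_width_bits(r, n, p=P):
    """Width of one VLIW instruction: R x N sub-instructions of P bits each."""
    return r * n * p


class Core:
    """Hypothetical model of a core with R x S register files."""

    def __init__(self, r, n, global_mask):
        s = n // 2  # one register file per two processing paths (claims 4, 16)
        self.files = [[0] * Q for _ in range(r * s)]
        # Assumption: bit i set means register i is global in every file.
        self.global_mask = global_mask & ((1 << Q) - 1)

    def is_global(self, reg):
        return (self.global_mask >> reg) & 1 == 1

    def write(self, file_idx, reg, value):
        value &= (1 << M) - 1  # registers are M bits wide
        if self.is_global(reg):
            # Global register: the value is propagated to the
            # corresponding register in all other register files.
            for regs in self.files:
                regs[reg] = value
        else:
            # Private register: only the addressed file is updated.
            self.files[file_idx][reg] = value


core = Core(r=2, n=4, global_mask=0b1)  # register 0 global, the rest private
core.write(0, 0, 7)   # global write, visible in every register file
core.write(1, 5, 9)   # private write, visible only in register file 1
```

In this sketch the write to register 0 appears in all four register files, while the write to register 5 changes only the file that was addressed. The sketch models the end result of propagation only; it does not distinguish the bus arrangement of claims 11 and 23 from the serial arrangement of claims 12 and 24.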